Abstract

Let X and Y be two simple symmetric continuous-time random walks on the
vertices of the n-dimensional hypercube, $\mathbb{Z}_2^n$. We consider the class of
co-adapted couplings of these processes, and describe an intuitive coupling which
is shown to be the fastest in this class.
1. Introduction

Let $\mathbb{Z}_2^n$ be the group of binary n-tuples under coordinate-wise addition modulo 2:
this can be viewed as the set of vertices of an n-dimensional hypercube. For $x \in \mathbb{Z}_2^n$,
we write $x = (x(1), \dots, x(n))$, and define elements $\{e_i\}_{i=0}^{n}$ by

$$e_0 = (0, \dots, 0); \qquad e_i(k) = \mathbf{1}_{[i=k]}, \quad i = 1, \dots, n,$$

where $\mathbf{1}$ denotes the indicator function. For $x, y \in \mathbb{Z}_2^n$ let

$$|x - y| = \sum_{i=1}^{n} |x(i) - y(i)|$$

denote the Hamming distance between x and y.

A continuous-time random walk X on $\mathbb{Z}_2^n$ may be defined using a marked Poisson
process $\Lambda$ of rate n, with marks distributed uniformly on the set $\{1, 2, \dots, n\}$: the $i$th
coordinate of X is flipped to its opposite value (zero or one) at incident times of $\Lambda$ for
which the corresponding mark is equal to i. We write $\mathcal{L}(X_t)$ for the law of X at time
t. The unique equilibrium distribution of X is the uniform distribution on $\mathbb{Z}_2^n$.
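The construction above can be illustrated with a short simulation. The following is a minimal sketch only: the function names `hamming` and `simulate_walk` are hypothetical, and coordinates are indexed from 0 rather than from 1.

```python
import random

def hamming(x, y):
    """Hamming distance |x - y| between two binary tuples."""
    return sum(a != b for a, b in zip(x, y))

def simulate_walk(x0, t_end, rng):
    """Return the state at time t_end of the walk started from x0."""
    n = len(x0)
    x = list(x0)
    t = rng.expovariate(n)        # first incident time of the rate-n Poisson process
    while t <= t_end:
        i = rng.randrange(n)      # uniform mark (0-based coordinate index)
        x[i] ^= 1                 # flip coordinate i
        t += rng.expovariate(n)   # next incident time
    return tuple(x)

rng = random.Random(0)
x_t = simulate_walk((0,) * 8, t_end=1.5, rng=rng)
print(x_t, hamming(x_t, (0,) * 8))
```

Each coordinate is flipped at rate one, since the rate-n process is thinned uniformly over the n coordinates.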
Suppose that we now wish to couple two such random walks, X and Y, starting
from different states.

Definition 1.1. A coupling of X and Y is a process $(X', Y')$ on $\mathbb{Z}_2^n \times \mathbb{Z}_2^n$ such that

$$X' \overset{\mathcal{D}}{=} X \quad \text{and} \quad Y' \overset{\mathcal{D}}{=} Y.$$

That is, viewed marginally, X' behaves as a version of X, and Y' as a version of Y.

For any coupling strategy c, write $(X_t^c, Y_t^c)$ for the value at t of the pair of processes
$X^c$ and $Y^c$ driven by strategy c, although this superscript notation may be dropped
when no confusion can arise. (We assume throughout that $(X^c, Y^c)$ is a coupling of
X and Y.) We then define the coupling time by

$$\tau^c = \inf\{t \ge 0 : X_s^c = Y_s^c \ \ \forall s \ge t\}.$$

Note that in general this is not necessarily a stopping time for either of the marginal
processes, nor even for the joint process. For $t \ge 0$, let

$$U_t^c = \{1 \le i \le n : X_t^c(i) \ne Y_t^c(i)\}$$

denote the set of unmatched coordinates at time t, and let

$$M_t^c = \{1 \le i \le n : X_t^c(i) = Y_t^c(i)\}$$

be its complement. A simple coupling technique appears in [1], and may be described
as follows:

• if X(i) flips at time t, with $i \in M_t$, then also flip coordinate Y(i) at time t
(matched coordinates are always made to move synchronously);

• if $|U_t| > 1$ and X(i) flips at time t, with $i \in U_t$, also flip coordinate Y(j) at time
t, where j is chosen uniformly at random from the set $U_t \setminus \{i\}$;

• else, if $U_t = \{i\}$ contains only one element, allow coordinates X(i) and Y(i) to
evolve independently of each other until this final match is made.

This defines a valid coupling of X and Y, for which existing coordinate matches are
maintained and new matches made in pairs when $|U_t| \ge 2$. It is also an example of a
co-adapted coupling.
Definition 1.2. A coupling $(X^c, Y^c)$ is called co-adapted if there exists a filtration
$(\mathcal{F}_t)_{t \ge 0}$ such that

1. $X^c$ and $Y^c$ are both adapted to $(\mathcal{F}_t)_{t \ge 0}$;

2. for any $0 \le s \le t$,

$$\mathcal{L}(X_t^c \mid \mathcal{F}_s) = \mathcal{L}(X_t^c \mid X_s^c) \quad \text{and} \quad \mathcal{L}(Y_t^c \mid \mathcal{F}_s) = \mathcal{L}(Y_t^c \mid Y_s^c).$$

In other words, $(X^c, Y^c)$ is co-adapted if $X^c$ and $Y^c$ are both Markov with respect to
a common filtration, $(\mathcal{F}_t)_{t \ge 0}$. Note, however, that this definition does not imply that the
joint process $(X^c, Y^c)$ is Markovian. If $(X^c, Y^c)$ is co-adapted then the coupling
time is a randomised stopping time with respect to the individual chains, and it suffices
to study the first collision time of the two chains (since it is then always possible to
make $X^c$ and $Y^c$ agree from this time onwards).

In this paper we search for the best possible coupling of the random walks X and
Y on $\mathbb{Z}_2^n$ within the class $\mathcal{C}$ of all co-adapted couplings.

2. Co-adapted couplings for random walks on $\mathbb{Z}_2^n$

In order to find the optimal co-adapted coupling of X and Y, it is first necessary to
be able to describe a general coupling strategy $c \in \mathcal{C}$. To this end, let $\Lambda_{ij}$ ($0 \le i, j \le n$)
be independent unit-rate marked Poisson processes, with marks $W_{ij}$ chosen uniformly
on the interval [0, 1]. We let $(\mathcal{F}_t)_{t \ge 0}$ be any filtration satisfying

$$\sigma\Big\{\bigcup_{i,j} \Lambda_{ij}(s),\ \bigcup_{i,j} W_{ij}(s) : s \le t\Big\} \subseteq \mathcal{F}_t \quad \text{for all } t \ge 0.$$

The transitions of $X^c$ and $Y^c$ will be driven by the marked Poisson processes, and
controlled by a process $\{Q^c(t)\}_{t \ge 0}$ which is adapted to $(\mathcal{F}_t)_{t \ge 0}$. Here,
$Q^c(t) = \{q_{ij}^c(t) : 1 \le i, j \le n\}$ is an $n \times n$ doubly sub-stochastic matrix. Such a matrix
implicitly defines terms $\{q_{0j}^c(t) : 1 \le j \le n\}$ and $\{q_{i0}^c(t) : 1 \le i \le n\}$ such that

$$\sum_{i=0}^{n} q_{ij}^c(t) = 1 \quad \text{for all } 1 \le j \le n \text{ and } t \ge 0, \tag{2.1}$$

and

$$\sum_{j=0}^{n} q_{ij}^c(t) = 1 \quad \text{for all } 1 \le i \le n \text{ and } t \ge 0. \tag{2.2}$$

For convenience we also define $q_{00}^c(t) = 0$ for all $t \ge 0$.

Note that any co-adapted coupling $(X^c, Y^c)$ must satisfy the following three
constraints, all of which are due to the marginal processes $X^c(i)$ ($i = 1, \dots, n$) being
independent unit rate Poisson processes (and similarly for the processes $Y^c(i)$):

1. At any instant the number of jumps by the process $(X^c, Y^c)$ cannot exceed two
(one on $X^c$ and one on $Y^c$);

2. All single and double jumps must have rates bounded above by one;

3. For all $i = 1, \dots, n$, the total rate at which $X^c(i)$ jumps must equal one.

A general co-adapted coupling for X and Y may therefore be defined as follows:
if there is a jump in the process $\Lambda_{ij}$ at time $t \ge 0$, and the mark $W_{ij}(t)$ satisfies
$W_{ij}(t) \le q_{ij}^c(t)$, then set $X_t^c = X_{t-}^c + e_i$ (mod 2) and $Y_t^c = Y_{t-}^c + e_j$ (mod 2). Note
that if i (respectively j) equals zero, then $X_t^c = X_{t-}^c$ (respectively, $Y_t^c = Y_{t-}^c$), since
$e_0 = (0, \dots, 0)$.

From this construction it follows directly that $X^c$ and $Y^c$ both have the correct
marginal transition rates to be continuous-time simple random walks on $\mathbb{Z}_2^n$ as described
above, and are co-adapted.
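As an illustration of the constraints (2.1) and (2.2), the sketch below checks whether a given matrix is admissible and applies a single accepted jump. It assumes the matrix is stored as an (n+1) x (n+1) nested list in which index 0 stands for "no flip"; the names `is_valid_Q` and `apply_jump` are hypothetical.

```python
def is_valid_Q(Q, tol=1e-12):
    """Check (2.1), (2.2) and q_00 = 0 for an (n+1) x (n+1) matrix Q (index 0 = no flip)."""
    n = len(Q) - 1
    if any(len(row) != n + 1 for row in Q):
        return False
    if any(q < -tol or q > 1 + tol for row in Q for q in row):
        return False
    if abs(Q[0][0]) > tol:
        return False
    cols_ok = all(abs(sum(Q[i][j] for i in range(n + 1)) - 1) <= tol
                  for j in range(1, n + 1))
    rows_ok = all(abs(sum(Q[i][j] for j in range(n + 1)) - 1) <= tol
                  for i in range(1, n + 1))
    return cols_ok and rows_ok

def apply_jump(x, y, i, j):
    """Add e_i to x and e_j to y (mod 2); an index of 0 leaves that chain unchanged."""
    x, y = list(x), list(y)
    if i > 0:
        x[i - 1] ^= 1
    if j > 0:
        y[j - 1] ^= 1
    return tuple(x), tuple(y)

# Two admissible examples for n = 2: independent and fully synchronous evolution.
Q_independent = [[0, 1, 1],
                 [1, 0, 0],
                 [1, 0, 0]]
Q_synchronous = [[0, 0, 0],
                 [0, 1, 0],
                 [0, 0, 1]]
print(is_valid_Q(Q_independent), is_valid_Q(Q_synchronous))
```

In a full simulation, a jump of $\Lambda_{ij}$ would be accepted only when its mark does not exceed the current entry `Q[i][j]`; that thinning step is omitted here.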

3. Optimal coupling 

Our proposed optimal coupling strategy, $\hat{c}$, is very simple to describe, and depends
only upon the number of unmatched coordinates of X and Y. Let $N_t = |U_t|$ denote
the value of this number at time t. Strategy $\hat{c}$ may be summarised as follows:

• matched coordinates are always made to move synchronously (thus $N^{\hat{c}}$ is a
non-increasing process);

• if N is odd, all unmatched coordinates of X and Y are made to evolve
independently until N becomes even;

• if N is even, unmatched coordinates are coupled in pairs: when an unmatched
coordinate on X flips (thereby making a new match), a different, uniformly
chosen, unmatched coordinate on Y is forced to flip at the same instant (making
a total of two new matches).



Note the similarity between $\hat{c}$ and the coupling of Aldous described in Section 1: if N
is even these strategies are identical; if N is odd however, $\hat{c}$ seeks to restore the parity
of N as fast as possible, whereas Aldous's coupling continues to couple unmatched
coordinates in pairs until N = 1.

Definition 3.1. The matrix process $\hat{Q}$ corresponding to the coupling $\hat{c}$ is as follows:

• $\hat{q}_{ii}(t) = 1$ for all $i \in M_t$ and for all $t \ge 0$;

• if $N_t$ is odd, $\hat{q}_{i0}(t) = \hat{q}_{0i}(t) = 1$ for all $i \in U_t$;

• if $N_t$ is even, $\hat{q}_{i0}(t) = \hat{q}_{0i}(t) = \hat{q}_{ii}(t) = 0$ for all $i \in U_t$, and
$\hat{q}_{ij}(t) = \frac{1}{N_t - 1}$ for all distinct $i, j \in U_t$.
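Definition 3.1 can be written out concretely. The sketch below builds the matrix for given states x and y and reads off the downward rates of N as in equation (3.7). It assumes the matrix is stored as an (n+1) x (n+1) nested list with index 0 standing for "no flip"; the names `q_hat` and `rates_down` are hypothetical.

```python
def q_hat(x, y):
    """Matrix of Definition 3.1 for current states x, y; index 0 means no flip."""
    n = len(x)
    unmatched = [i for i in range(1, n + 1) if x[i - 1] != y[i - 1]]
    matched = [i for i in range(1, n + 1) if x[i - 1] == y[i - 1]]
    k = len(unmatched)
    Q = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in matched:
        Q[i][i] = 1.0                        # matched coordinates move together
    if k % 2 == 1:
        for i in unmatched:
            Q[i][0] = Q[0][i] = 1.0          # odd N: unmatched coordinates evolve independently
    else:
        for i in unmatched:
            for j in unmatched:
                if i != j:
                    Q[i][j] = 1.0 / (k - 1)  # even N: pair X(i) with a uniformly chosen Y(j)
    return Q

def rates_down(Q, x, y):
    """Return (lambda(k, k-1), lambda(k, k-2)) as in equation (3.7)."""
    n = len(x)
    U = [i for i in range(1, n + 1) if x[i - 1] != y[i - 1]]
    lam1 = sum(Q[i][0] + Q[0][i] for i in U)
    lam2 = sum(Q[i][j] for i in U for j in U if i != j)
    return lam1, lam2

x, y = (0, 0, 0, 0), (1, 1, 1, 0)
print(rates_down(q_hat(x, y), x, y))
```

With k unmatched coordinates, the rates come out as $\lambda(k, k-1) = 2k$ when k is odd and $\lambda(k, k-2) = k$ when k is even.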

The coupling time under $\hat{c}$, when $(X_0, Y_0) = (x, y)$, can thus be expressed as follows:

$$\hat{\tau} = \tau^{\hat{c}} = \begin{cases} E_0 + E_1 + E_2 + \dots + E_{m-1} + E_m & \text{if } |x - y| = 2m \\ E_0 + E_1 + E_2 + \dots + E_{m-1} + E_m + E_{2m+1} & \text{if } |x - y| = 2m + 1, \end{cases} \tag{3.1}$$

where $\{E_k\}_{k \ge 0}$ form a set of independent Exponential random variables, with $E_k$
having rate 2k. (Note that $E_0 = 0$: it is included merely for notational convenience.)
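The representation (3.1) makes the mean coupling time under $\hat{c}$ easy to compute exactly, and this can be compared with the pairwise coupling of Section 1. In the sketch below, the rates used for the latter are derived from its description (a pair of matches is made at total rate k while k > 1 coordinates are unmatched, and the final match is made at rate 2); they are an assumption of this example rather than a formula quoted from the text. The function names are hypothetical.

```python
from fractions import Fraction

def mean_time_optimal(k):
    """E[tau_hat] from (3.1): rates 2, 4, ..., 2m, plus 2(2m+1) when k = 2m + 1."""
    m, odd = divmod(k, 2)
    total = sum(Fraction(1, 2 * i) for i in range(1, m + 1))
    if odd:
        total += Fraction(1, 2 * k)
    return total

def mean_time_aldous(k):
    """Mean coupling time of the pairwise coupling of Section 1 (rates derived, not quoted)."""
    total = Fraction(0)
    while k > 1:
        total += Fraction(1, k)   # a pair of matches is made at total rate k
        k -= 2
    if k == 1:
        total += Fraction(1, 2)   # last coordinate: X(i) and Y(i) flip independently
    return total

for k in range(1, 8):
    print(k, mean_time_optimal(k), mean_time_aldous(k))
```

The two means agree for even k, where the strategies coincide, and differ for odd k of at least 3.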
Now define

$$\hat{v}(x, y, t) = \mathbb{P}[\hat{\tau} > t \mid X_0 = x, Y_0 = y] \tag{3.2}$$

to be the tail probability of the coupling time under $\hat{c}$. The main result of this paper
is the following.

Theorem 3.1. For any states $x, y \in \mathbb{Z}_2^n$ and time $t \ge 0$,

$$\hat{v}(x, y, t) = \inf_{c \in \mathcal{C}} \mathbb{P}[\tau^c > t \mid X_0 = x, Y_0 = y]. \tag{3.3}$$

In other words, $\hat{\tau}$ is the stochastic minimum of all co-adapted coupling times for the
pair (X, Y).

It is clear from the representation in (3.1) that $\hat{v}(x, y, t)$ only depends on (x, y)
through $|x - y|$, and so we shall usually simply write

$$\hat{v}(k, t) = \mathbb{P}[\hat{\tau} > t \mid N_0 = k],$$

with the convention that $\hat{v}(k, t) = 0$ for $k \le 0$. Note, again from (3.1), that $\hat{v}(k, t)$ is
strictly increasing in k. For a strategy $c \in \mathcal{C}$, define the process $S_t^c$ by

$$S_t^c = \hat{v}(X_t^c, Y_t^c, T - t),$$

where $T > 0$ is some fixed time. This is the conditional probability of X and Y not
having coupled by time T, when strategy c has been followed over the interval [0, t] and
$\hat{c}$ has then been used from time t onwards. The optimality of $\hat{c}$ will follow by Bellman's
principle (see, for example, [7]) if it can be shown that $S_{t \wedge \tau^c}^c$ is a submartingale for
all $c \in \mathcal{C}$, as demonstrated in the following lemma. (Here and throughout, $s \wedge t = \min\{s, t\}$.)

Lemma 3.1. Suppose that for each $c \in \mathcal{C}$ and each $T \in \mathbb{R}_+$,

$$\big(S_{t \wedge \tau^c}^c\big)_{0 \le t \le T} \text{ is a submartingale.}$$

Then equation (3.3) holds.

Proof. Notice that, with $(X_0, Y_0) = (x, y)$, $S_0^c = \hat{v}(x, y, T)$ and $S_{T \wedge \tau^c}^c = \mathbf{1}_{[T < \tau^c]}$. If
$S_{\cdot \wedge \tau^c}^c$ is a submartingale it follows by the Optional Sampling Theorem that

$$\mathbb{P}[\tau^c > T] = \mathbb{E}\big[S_{T \wedge \tau^c}^c\big] \ge S_0^c = \hat{v}(x, y, T) = \mathbb{P}[\hat{\tau} > T],$$

and hence the infimum in (3.3) is attained by $\hat{c}$.

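Now, (point process) stochastic calculus yields:

$$dS_t^c = dZ_t^c + \left(\mathcal{A}_t^c \hat{v} - \frac{\partial \hat{v}}{\partial t}\right) dt, \tag{3.4}$$

where $Z^c$ is a martingale, and $\mathcal{A}_t^c$ is the "generator" corresponding to the matrix $Q^c(t)$.
Since the Poisson processes $\Lambda_{ij}$ are independent, the probability of two or more jumps
occurring in the superimposed process $\bigcup_{i,j} \Lambda_{ij}$ in a time interval of length $\delta$ is $O(\delta^2)$.
Hence, for any function $f : \mathbb{Z}_2^n \times \mathbb{Z}_2^n \times \mathbb{R}_+ \to \mathbb{R}$, $\mathcal{A}_t^c$ satisfies

$$\mathcal{A}_t^c f(x, y, t) = \sum_{i=0}^{n} \sum_{j=0}^{n} q_{ij}^c(t) \big[ f(x + e_i, y + e_j, t) - f(x, y, t) \big].$$

Setting $f = \hat{v}$ gives:

$$\mathcal{A}_t^c \hat{v}(x, y, t) = \sum_{i=0}^{n} \sum_{j=0}^{n} q_{ij}^c(t) \big[ \hat{v}(|x - y + e_i + e_j|, t) - \hat{v}(|x - y|, t) \big].$$

In particular, since $\hat{v}$ is invariant under coordinate permutation, if $N_t^c = |x - y| = k$
then

$$\mathcal{A}_t^c \hat{v}(x, y, t) = \sum_{m=-2}^{2} \lambda_t^c(k, k + m) \big[ \hat{v}(k + m, t) - \hat{v}(k, t) \big], \tag{3.5}$$

where $\lambda_t^c(k, k + m)$ is the rate (according to $Q^c(t)$) at which $N_t^c$ jumps from k to k + m.
More explicitly,

$$\lambda_t^c(k, k + 2) = \sum_{\substack{i, j \in M_t \\ i \ne j}} q_{ij}^c(t), \qquad \lambda_t^c(k, k + 1) = \sum_{i \in M_t} \big( q_{i0}^c(t) + q_{0i}^c(t) \big), \tag{3.6}$$

$$\lambda_t^c(k, k - 2) = \sum_{\substack{i, j \in U_t \\ i \ne j}} q_{ij}^c(t), \qquad \lambda_t^c(k, k - 1) = \sum_{i \in U_t} \big( q_{i0}^c(t) + q_{0i}^c(t) \big), \tag{3.7}$$

and

$$\lambda_t^c(k, k) = \sum_{i \in U_t,\, j \in M_t} \big( q_{ij}^c(t) + q_{ji}^c(t) \big) + \sum_{i=1}^{n} q_{ii}^c(t). \tag{3.8}$$

It follows from the definition of $Q^c$ and equations (3.6) to (3.8) that these terms must
satisfy the linear constraints:

$$\lambda_t^c(k, k - 2) + \tfrac{1}{2} \lambda_t^c(k, k - 1) \le k, \quad \text{and}$$

$$\lambda_t^c(k, k - 2) + \tfrac{1}{2} \lambda_t^c(k, k - 1) + \lambda_t^c(k, k) + \tfrac{1}{2} \lambda_t^c(k, k + 1) + \lambda_t^c(k, k + 2) = n.$$

Denote by $L_n$ the set of non-negative $\lambda$ satisfying the constraints

$$\lambda(k, k - 2) + \tfrac{1}{2} \lambda(k, k - 1) \le k, \quad \text{and} \tag{3.9}$$

$$\lambda(k, k - 2) + \tfrac{1}{2} \lambda(k, k - 1) + \lambda(k, k) + \tfrac{1}{2} \lambda(k, k + 1) + \lambda(k, k + 2) = n. \tag{3.10}$$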

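Returning to equation (3.4):

$$dS_t^c = dZ_t^c + \left[ \mathcal{A}_t^c \hat{v} - \frac{\partial \hat{v}}{\partial t} \right] dt.$$

We wish to show that $S_{t \wedge \tau^c}^c$ is a submartingale for all couplings $c \in \mathcal{C}$. We shall do
this by showing that $\mathcal{A}_t^c \hat{v}$ is minimised by setting $c = \hat{c}$. This is sufficient because
$S_{t \wedge \hat{\tau}}^{\hat{c}}$ is a martingale (and so $\mathcal{A}_t^{\hat{c}} \hat{v} - \partial \hat{v} / \partial t = 0$). Now, from equation (3.5) we know that

$$\mathcal{A}_t^c \hat{v}(k, t) = \sum_{m=-2}^{2} \lambda_t^c(k, k + m) \big[ \hat{v}(k + m, t) - \hat{v}(k, t) \big].$$

Thus we seek to show that, for all $k \ge 0$ and for all $t \ge 0$, the rates corresponding to $\hat{c}$ attain

$$\max_{\lambda \in L_n} \sum_{m=-2}^{2} \lambda(k, k + m) \big[ \hat{v}(k, t) - \hat{v}(k + m, t) \big] \ge 0. \tag{3.11}$$

For each t, this is a linear function of non-negative terms of the form $\lambda(k, k + m)$.
Thanks to the monotonicity in its first argument of $\hat{v}$, the terms appearing in the
left-hand-side of (3.11) are non-positive if and only if m is non-negative. Hence we must
set

$$\lambda(k, k + 1) = \lambda(k, k + 2) = 0 \tag{3.12}$$

in order to achieve the maximum in (3.11).
It now suffices to maximise

$$\lambda(k, k - 1) \big[ \hat{v}(k, t) - \hat{v}(k - 1, t) \big] + \lambda(k, k - 2) \big[ \hat{v}(k, t) - \hat{v}(k - 2, t) \big] \tag{3.13}$$

subject to the constraint in (3.9).

Since $\hat{v}(k, t) - \hat{v}(k - 2, t) \ge 0$, the constraint (3.9) may be taken to hold with equality, so that
$\lambda(k, k - 2) = k - \tfrac{1}{2} \lambda(k, k - 1)$. Combining (3.9) and (3.13) in this way yields the final
version of our optimisation problem:

$$\text{maximise} \quad \lambda(k, k - 1) \Big[ \big( \hat{v}(k, t) - \hat{v}(k - 1, t) \big) - \tfrac{1}{2} \big( \hat{v}(k, t) - \hat{v}(k - 2, t) \big) \Big] \tag{3.14}$$

$$\text{subject to} \quad 0 \le \lambda(k, k - 1) \le 2k. \tag{3.15}$$

The solution to this problem is clearly given by:

$$\lambda(k, k - 1) = \begin{cases} 2k & \text{if } \hat{v}(k, t) - \hat{v}(k - 1, t) > \tfrac{1}{2} \big[ \hat{v}(k, t) - \hat{v}(k - 2, t) \big] \\ 0 & \text{otherwise.} \end{cases} \tag{3.16}$$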
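These observations may be summarised as follows:

Proposition 3.1. For $\lambda \in L_n$, the maximum value of

$$\sum_{m=-2}^{2} \lambda(k, k + m) \big[ \hat{v}(k, t) - \hat{v}(k + m, t) \big]$$

is achieved at $\lambda^*$, where $\lambda^*$ satisfies the following:

$$\lambda^*(k, k + 1) = \lambda^*(k, k + 2) = 0;$$

$$\lambda^*(k, k - 2) + \tfrac{1}{2} \lambda^*(k, k - 1) = k;$$

$$\lambda^*(k, k - 1) = \begin{cases} 2k & \text{if } \hat{v}(k, t) - \hat{v}(k - 1, t) > \tfrac{1}{2} \big[ \hat{v}(k, t) - \hat{v}(k - 2, t) \big] \\ 0 & \text{otherwise.} \end{cases}$$

Our final proposition shows that $\lambda^*(k, k - 1) = 2k$ if and only if k is odd.

Proposition 3.2. For any fixed $t \ge 0$,

$$2 \big[ \hat{v}(k, t) - \hat{v}(k - 1, t) \big] - \big[ \hat{v}(k, t) - \hat{v}(k - 2, t) \big] \ge 0 \quad \text{if k is odd, and} \tag{3.17}$$

$$2 \big[ \hat{v}(k, t) - \hat{v}(k - 1, t) \big] - \big[ \hat{v}(k, t) - \hat{v}(k - 2, t) \big] \le 0 \quad \text{if k is even.} \tag{3.18}$$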
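The inequalities (3.17) and (3.18) can be checked numerically for small k before reading the proof. The sketch below evaluates $\hat{v}(k, t)$ from the representation (3.1), using the standard closed form for the tail of a sum of independent Exponentials with distinct rates, and then applies the rule (3.16). The function names are hypothetical, and the closed form is only numerically reliable for moderate k.

```python
import math

def coupling_rates(k):
    """Rates of the Exponential summands in (3.1) for N_0 = k."""
    m, odd = divmod(k, 2)
    rates = [2.0 * i for i in range(1, m + 1)]
    if odd:
        rates.append(2.0 * k)
    return rates

def v_hat(k, t):
    """Tail P[tau_hat > t | N_0 = k] of a sum of Exponentials with distinct rates."""
    if k <= 0:
        return 0.0
    rates = coupling_rates(k)
    total = 0.0
    for a, ra in enumerate(rates):
        weight = 1.0
        for b, rb in enumerate(rates):
            if b != a:
                weight *= rb / (rb - ra)
        total += weight * math.exp(-ra * t)
    return total

def second_difference(k, t):
    """2[v(k,t) - v(k-1,t)] - [v(k,t) - v(k-2,t)], the quantity in (3.17) and (3.18)."""
    return 2 * (v_hat(k, t) - v_hat(k - 1, t)) - (v_hat(k, t) - v_hat(k - 2, t))

def optimal_lambda_down1(k, t):
    """lambda*(k, k-1) according to (3.16)."""
    lhs = v_hat(k, t) - v_hat(k - 1, t)
    rhs = 0.5 * (v_hat(k, t) - v_hat(k - 2, t))
    return 2 * k if lhs > rhs else 0

for k in range(1, 9):
    print(k, second_difference(k, 0.5), optimal_lambda_down1(k, 0.5))
```

For the values tried, the sign of the second difference alternates with the parity of k, in agreement with Proposition 3.2.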

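Proof. Define $V_\alpha$ by

$$V_\alpha(k) = \int_0^\infty e^{-\alpha t}\, \hat{v}(k, t)\, dt = \frac{1}{\alpha} \big( 1 - \mathbb{E}\big[ e^{-\alpha \hat{\tau}} \big] \big).$$

We also define $d(k, t) = \hat{v}(k, t) - \hat{v}(k - 1, t)$, and for $\alpha \ge 0$ let

$$D_\alpha(k) = \int_0^\infty e^{-\alpha t}\, d(k, t)\, dt$$

be the Laplace transform of $d(k, \cdot)$. Given the representation in equation (3.1) of $\hat{\tau}$ as
a sum of independent Exponential random variables, it follows that

$$V_\alpha(k) = \begin{cases} \dfrac{1}{\alpha} \left[ 1 - \displaystyle\prod_{i=1}^{m} \frac{2i}{2i + \alpha} \right] & \text{if } k = 2m \\[2ex] \dfrac{1}{\alpha} \left[ 1 - \dfrac{2(2m + 1)}{2(2m + 1) + \alpha} \displaystyle\prod_{i=1}^{m} \frac{2i}{2i + \alpha} \right] & \text{if } k = 2m + 1. \end{cases} \tag{3.19}$$

To ease notation, let

$$\phi_\alpha(m) = \prod_{i=1}^{m} \frac{2i}{2i + \alpha}.$$

The following equality then follows directly from consideration of the transition rates
corresponding to strategy $\hat{c}$: for all $\alpha \ge 0$ and $m \ge 1$,

$$1 - \alpha V_\alpha(2m) + 2m \big[ V_\alpha(2m - 2) - V_\alpha(2m) \big] = \phi_\alpha(m) + \frac{2m}{\alpha} \big[ \phi_\alpha(m) - \phi_\alpha(m - 1) \big] = \phi_\alpha(m) \left[ 1 + \frac{2m}{\alpha} \left( 1 - \frac{2m + \alpha}{2m} \right) \right] = 0, \tag{3.20}$$

where the second equality uses $\phi_\alpha(m) = \frac{2m}{2m + \alpha}\, \phi_\alpha(m - 1)$. Similarly,

$$1 - \alpha V_\alpha(2m - 1) + 2(2m - 1) \big[ V_\alpha(2m - 2) - V_\alpha(2m - 1) \big] = 0. \tag{3.21}$$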

Now suppose that k = 2m, and hence is even. We wish to prove that

$$d(2m - 1, t) - d(2m, t) \ge 0 \quad \text{for all } t \ge 0,$$

which is equivalent to showing that $D_\alpha(2m - 1) - D_\alpha(2m)$ is totally (or completely)
monotone (by the Bernstein-Widder Theorem; Theorem 1a of [3], Ch. XIII.4).
We proceed by subtracting equation (3.21) from (3.20):

$$0 = -\alpha \big[ V_\alpha(2m) - V_\alpha(2m - 1) \big] + 2m \big[ V_\alpha(2m - 2) - V_\alpha(2m) \big] + 2(2m - 1) \big[ V_\alpha(2m - 1) - V_\alpha(2m - 2) \big]$$

$$\phantom{0} = -\alpha D_\alpha(2m) - 2m \big[ D_\alpha(2m) + D_\alpha(2m - 1) \big] + 2(2m - 1) D_\alpha(2m - 1),$$

and so

$$D_\alpha(2m - 1) - D_\alpha(2m) = \frac{2 + \alpha}{2m - 2}\, D_\alpha(2m). \tag{3.22}$$

It therefore suffices to show that $(2 + \alpha) D_\alpha(2m)$ is completely monotone.
Now note from the form of $V_\alpha$ in equation (3.19) that

$$(2 + \alpha) D_\alpha(2m) = 2\, \Theta_\alpha(2m),$$

where $\Theta_\alpha(2m)$ is the Laplace transform of

$$\theta(2m, t) = \mathbb{P}\Big[ \sum_{i=2}^{m} E_i > t \Big] - \mathbb{P}\Big[ \sum_{i=2}^{m-1} E_i + E_{2m-1} > t \Big],$$

where $\{E_i\}_{i \ge 0}$ form a set of independent Exponential random variables, with $E_i$ having
parameter 2i. But since $\theta(2m, t)$ is non-negative for all t (the rate of $E_m$ being no larger
than that of $E_{2m-1}$), it follows that $(2 + \alpha) D_\alpha(2m)$ is completely monotone, as required.
This proves that, for any fixed $t \ge 0$,

$$\big[ \hat{v}(k, t) - \hat{v}(k - 1, t) \big] - \tfrac{1}{2} \big[ \hat{v}(k, t) - \hat{v}(k - 2, t) \big] \le 0 \tag{3.23}$$

whenever k is even. Thus inequality (3.18) holds in this case.

Now suppose that k = 2m + 1, and hence is odd. In this case we wish to show that
inequality (3.17) holds, which is equivalent to showing that $D_\alpha(2m + 1) - D_\alpha(2m)$ is
completely monotone. Now, substituting m + 1 for m in equation (3.21) yields

$$1 - \alpha V_\alpha(2m + 1) + 2(2m + 1) \big[ V_\alpha(2m) - V_\alpha(2m + 1) \big] = 0. \tag{3.24}$$

Proceeding as above, we subtract equation (3.20) from (3.24):

$$0 = -\alpha \big[ V_\alpha(2m + 1) - V_\alpha(2m) \big] + 2(2m + 1) \big[ V_\alpha(2m) - V_\alpha(2m + 1) \big] + 2m \big[ V_\alpha(2m) - V_\alpha(2m - 2) \big]$$

$$\phantom{0} = -\alpha D_\alpha(2m + 1) - 2(2m + 1) D_\alpha(2m + 1) + 2m \big[ D_\alpha(2m) + D_\alpha(2m - 1) \big]. \tag{3.25}$$

Then it follows from equation (3.22) that

$$(2m - 2) D_\alpha(2m - 1) = (2m + \alpha) D_\alpha(2m). \tag{3.26}$$

Substitution of equation (3.26) into (3.25) gives

$$0 = (4m + 2 + \alpha) \big[ D_\alpha(2m) - D_\alpha(2m + 1) \big] + 2 \big[ D_\alpha(2m - 1) - D_\alpha(2m) \big],$$

and so

$$D_\alpha(2m + 1) - D_\alpha(2m) = \frac{2}{4m + 2 + \alpha} \big[ D_\alpha(2m - 1) - D_\alpha(2m) \big]. \tag{3.27}$$

But, since we have already seen that $D_\alpha(2m - 1) - D_\alpha(2m)$ is completely monotone, the
right-hand-side of equation (3.27) is the product of two completely monotone functions,
and so is itself completely monotone [3], as required.

Now we may complete the

Proof of Theorem 3.1. Thanks to Lemma 3.1 and Proposition 3.1, Proposition 3.2
(along with equations (3.12) and (3.16)) shows that any optimal choice of Q(t), denoted $Q^*(t)$,
is of the following form:

• when $N_t$ is odd:

$q_{i0}^*(t) = q_{0i}^*(t) = 1$ for all $i \in U_t$ (and so $\lambda_t^*(N_t, N_t - 1) = 2N_t$),
$q_{ii}^*(t) = 1$ for all $i \in M_t$;

• when $N_t$ is even:

$q_{i0}^*(t) = q_{0i}^*(t) = q_{ii}^*(t) = 0$ for all $i \in U_t$ (and so $\lambda_t^*(N_t, N_t - 1) = 0$), (3.28)
$q_{ii}^*(t) = 1$ for all $i \in M_t$.

This is in agreement with our candidate strategy $\hat{Q}$ (recall Definition 3.1). From
equation (3.28) it follows that the values of $q_{ij}^*(t)$ for distinct $i, j \in U_t$ must satisfy

$$\lambda_t^*(N_t, N_t - 2) = \sum_{\substack{i, j \in U_t \\ i \ne j}} q_{ij}^*(t) = N_t,$$

but are not constrained beyond this. Our choice of

$$\hat{q}_{ij}(t) = \frac{1}{N_t - 1} \quad \text{for distinct } i, j \in U_t$$

satisfies this constraint, and so $\hat{c}$ is truly an optimal co-adapted coupling, as claimed.

Remark 3.1. Observe that when k = 1, equation (3.1) implies that $\hat{v}(1, t) = \hat{v}(2, t)$
for all t. The optimisation problem in (3.14) and (3.15) simplifies in this case to the
following:

$$\text{maximise} \quad \lambda(1, 0)\, \hat{v}(1, t) \tag{3.29}$$

$$\text{subject to} \quad \tfrac{1}{2} \lambda(1, 0) + \lambda(1, 1) + \tfrac{1}{2} \lambda(1, 2) \le n. \tag{3.30}$$

As above, this is achieved by setting $\lambda(1, 0) = 2$. Note from equation (3.30), however,
that when k = 1 there is no obligation to set $\lambda(1, 2) = 0$ in order to attain the required
maximum. Indeed, due to the equality between $\hat{v}(1, t)$ and $\hat{v}(2, t)$, when k = 1 it is not
sub-optimal to allow matched coordinates to evolve independently (corresponding to
$\lambda_t(1, 2) > 0$), so long as strategy $\hat{c}$ is used once more as soon as k = 2.



4. Maximal coupling 



Let X and Y be two copies of a Markov chain on a countable space, starting
from different states. The coupling inequality (see, for example, ^) bounds the tail
distribution of the coupling time of any coupling of X and Y from below by the total
variation distance between the two processes:

$$\| \mathcal{L}(X_t) - \mathcal{L}(Y_t) \|_{TV} \le \mathbb{P}[\tau > t]. \tag{4.1}$$

Griffeath [5] showed that, for discrete-time chains, there always exists a maximal
coupling of X and Y: that is, one which achieves equality for all $t \ge 0$ in the coupling
inequality. This result was extended to general continuous-time stochastic processes
with paths in Skorohod space in [11]. However, in general such a coupling is not
co-adapted. In light of the results of Section 3, where it was shown that $\hat{c}$ is the
optimal co-adapted coupling for the symmetric random walk on $\mathbb{Z}_2^n$, a natural question
is whether $\hat{c}$ is also a maximal coupling.

This is certainly not the case in general. Suppose that X and Y are once again
random walks on $\mathbb{Z}_2^n$, with $X_0 = (0, 0, \dots, 0)$ and $Y_0 = (1, 1, \dots, 1)$: calculations
as in [2] show that the total variation distance between $X_t$ and $Y_t$ exhibits a cutoff
phenomenon, with the cutoff taking place at time $T_n = \tfrac{1}{4} \log n$ for large n. This implies
that a maximal coupling of X and Y has expected coupling time of order $T_n$. However,
it follows from the representation of $\hat{\tau}$ in equation (3.1) that

$$\mathbb{E}\big[ \hat{\tau} \mid |X_0 - Y_0| = n = 2m \big] = \mathbb{E}[E_1 + E_2 + \dots + E_{m-1} + E_m] \sim \tfrac{1}{2} \log(n). \tag{4.2}$$

It follows that $\hat{c}$ is not, in general, a maximal coupling.

A faster coupling of X and Y was proposed by [3]. This coupling also makes new
coordinate matches in pairs, but uses information about the future evolution of one of
the chains in order to make such matches in a more efficient manner. This coupling
is very near to being maximal (it captures the correct cutoff time), but is of course
not co-adapted. Further results related to the construction of maximal couplings for
general Markov chains may be found in [U [SI [TU].
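The comparison between (4.2) and the cutoff time can be made concrete numerically. The sketch below evaluates the mean in (4.2) exactly as the finite sum $\sum_{i=1}^{m} 1/(2i)$ for even n = 2m, and prints it next to $\tfrac{1}{2} \log n$ and $\tfrac{1}{4} \log n$. The function name is hypothetical.

```python
import math

def mean_time_from_antipodes(n):
    """E[tau_hat | N_0 = n] for even n = 2m, i.e. sum_{i=1}^{m} 1/(2i), as in (4.2)."""
    if n % 2 != 0:
        raise ValueError("this sketch only treats even n, as in (4.2)")
    return sum(1.0 / (2 * i) for i in range(1, n // 2 + 1))

for n in (10, 100, 1000, 10**4, 10**5):
    mean = mean_time_from_antipodes(n)
    print(n, round(mean, 4), round(0.5 * math.log(n), 4), round(0.25 * math.log(n), 4))
```

The mean tracks $\tfrac{1}{2} \log n$ to leading order, which is twice the cutoff time.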